The data contains 336 records of participants in the study, each with 51 variables. Our project focused on the following variables:
Baseline SSQ (BSSQ) of 16 symptoms (quantitative, discrete): self-reported symptom severity of participants before undergoing VR, on a scale of 1 to 10.
Active SSQ (ASSQ) of 16 symptoms (quantitative, discrete): self-reported symptom severity of participants after undergoing VR, on a scale of 1 to 10.
The age of the participants (quantitative, discrete); participants were then sorted into age groups, stored as a ‘factor’ (qualitative, ordinal).
This grouping makes the relationship between age group and reported symptoms easier to see.
Whether the participant has had VR experience (qualitative, nominal); this was reclassified from ‘character’ into ‘factor’.
R read this column as ‘chr’, but it is a qualitative variable.
The change (\(\Delta\)) (quantitative, discrete) between baseline (before) and active (after) SSQ was calculated for each participant and each symptom; we then took the average \(\Delta\) for each symptom within each group.
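The change calculation described above can be sketched as follows. This is a minimal illustration on invented toy data, not the code used in the project: the data frame `toy` and the column names `b_nausea`, `a_nausea`, `b_headache` and `a_headache` are hypothetical, and only base R is used.

```r
# Toy data: hypothetical column names, values invented for illustration only
toy <- data.frame(
  VRexperience = factor(c("yes", "yes", "no", "no")),
  b_nausea   = c(1, 2, 1, 3),  # baseline (before VR)
  a_nausea   = c(3, 5, 2, 3),  # active (after VR)
  b_headache = c(2, 1, 1, 1),
  a_headache = c(2, 4, 3, 2)
)

symptoms <- c("nausea", "headache")

# Per-participant change for each symptom: delta = active - baseline
delta <- toy[paste0("a_", symptoms)] - toy[paste0("b_", symptoms)]
names(delta) <- paste0("delta_", symptoms)

# Average change per symptom within each VR-experience group
mean_delta <- aggregate(
  delta,
  by = list(VRexperience = toy$VRexperience),
  FUN = mean
)
print(mean_delta)
```

The result has one row per group and one column per symptom, which is the shape needed to draw one polygon per group on a spider chart.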
A limitation of this data set is that the symptoms and their severity, for both baseline and active SSQs, are self-reported. Interpretation of the scale can vary from person to person, which makes the reported values subjective and less reliable.
Another limitation is that VR experience is recorded only as ‘yes’ or ‘no’, which does not describe how much exposure to VR the participant has had in the past or how recent that experience was. However, it was mentioned that participants who had used VR more than 10 times were excluded from the study.
Assumptions
We assumed that the participants were truthful and correct about whether or not they have participated in VR before. We also assumed that the participants followed the instructions in the symptoms survey correctly and reported their symptoms on a scale of 1 to 10, with 1 being the least severe and 10 being the most.
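The second assumption could be checked mechanically before analysis. The sketch below is illustrative only: `outside_scale` is a hypothetical helper, and the response values are invented rather than taken from the study data.

```r
# Hypothetical helper: flag responses that are missing, non-integer,
# or outside the assumed 1 to 10 scale
outside_scale <- function(x, lo = 1, hi = 10) {
  is.na(x) | x < lo | x > hi | x != round(x)
}

responses <- c(1, 4, 10, 0, 11, 2.5, NA)  # invented values for illustration
flags <- outside_scale(responses)

# Responses that would need to be inspected or excluded
responses[flags]
```

Applying a check like this to each symptom column would show whether any participant answered outside the instructed scale.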
Research Question
What is the effect of having VR experience on the symptoms experienced by people?
We filtered the data by whether or not the participant had VR experience and, for each symptom, took the average change between the BSSQ and ASSQ. The spider chart above visualizes this. From initial observations, both groups experience a similar \(\Delta\) (change) in most symptoms. Interestingly, those with VR experience seem to report a greater increase in most symptoms than those without VR experience.
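A two-group spider chart of this kind could be built roughly as follows. This is a sketch under stated assumptions, not the code that produced the chart above: the symptom names and mean changes are invented, `mean_delta` and `by_group` are hypothetical names, and the plot is only built if the plotly package is installed.

```r
# Invented mean changes per symptom for each group (illustration only)
symptoms <- c("nausea", "headache", "eyestrain", "fatigue")
mean_delta <- data.frame(
  symptom      = rep(symptoms, times = 2),
  VRexperience = rep(c("yes", "no"), each = length(symptoms)),
  delta        = c(1.2, 0.8, 1.5, 0.9, 0.9, 0.7, 1.1, 0.6)
)

# One vector of radii per group, kept in the same symptom order
by_group <- split(mean_delta$delta, mean_delta$VRexperience)

# Build one polygon per group; skipped if plotly is not installed
if (requireNamespace("plotly", quietly = TRUE)) {
  fig <- plotly::plot_ly(
    type = "scatterpolar",
    mode = "lines+markers",
    fill = "toself"
  )
  for (g in names(by_group)) {
    fig <- plotly::add_trace(fig, r = by_group[[g]], theta = symptoms, name = g)
  }
  fig <- plotly::layout(fig, polar = list(radialaxis = list(visible = TRUE)))
  # Type `fig` at the console to render the chart
}
```

Plotting both groups as separate traces on shared axes makes it easy to see for which symptoms one polygon extends beyond the other.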
Code
library(plotly)
library(tidyselect)

mean_sqq_ages = c(
  mean(grp1$ssq_full),
  mean(grp2$ssq_full),
  mean(grp3$ssq_full),
  mean(grp4$ssq_full),
  mean(grp5$ssq_full)
)

fig2 <- plot_ly(
  type = 'scatterpolar',
  fill = 'toself',
  r = mean_sqq_ages,
  theta = c("16 to 21", "22 to 29", "30 to 37", "38 to 45", "above 45")
)
fig2 <- fig2 %>%
  layout(
    polar = list(radialaxis = list(visible = TRUE, range = c(0, 20))),
    showlegend = TRUE
  )
fig2
No scatterpolar mode specifed:
Setting the mode to markers
Read more about this attribute -> https://plotly.com/r/reference/#scatter-mode
Code
fig3 = ggplot(data_by_age_group, aes(x = age_group, y = mean_ssq)) +
  geom_bar(stat = 'identity')
ggplotly(fig3)

filteredData = mutate(
  filteredData,
  age_class = case_when(
    age > q3 ~ "older",
    age > q2 ~ "old",
    age > q1 ~ "young",
    age > 0 ~ "younger"
  )
)
ggplot(
  filter(filteredData, age_class == "old" | age_class == "young"),
  aes(x = age_class, y = ssq_full)
) +
  geom_boxplot()
Code
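# right = FALSE makes the bins [16, 22), [22, 30), ... so that, for example,
# age 22 falls in "22-29" as the labels imply (the default would put it in "16-21")
filteredData["age_group"] = cut(
  filteredData$age,
  c(16, 22, 30, 38, 46, Inf),
  c("16-21", "22-29", "30-37", "38-45", "45+"),
  include.lowest = TRUE,
  right = FALSE
)
ggplot(filteredData, aes(x = VRexperience, y = ssq_full, fill = VRexperience)) +
  stat_summary(fun = "mean", geom = "bar", position = "dodge")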
Code
library(lubridate)
class(filteredData$Date[1])
[1] "POSIXct" "POSIXt"
Code
d = filteredData$Date[1] |> as.POSIXct()
year(d) # lubridate's year() extracts the calendar year; found with help from ChatGPT
[1] 2021
Code
# Approximate birth year = survey year - age, then map to a generation label
filteredData = mutate(
  filteredData,
  "generation" = case_when(
    year(Date) - age >= 2010 ~ "a",
    year(Date) - age >= 1997 ~ "z",
    year(Date) - age >= 1981 ~ "y",
    year(Date) - age >= 1965 ~ "x",
    TRUE ~ "old"
  )
)
ggplot(filteredData, aes(x = generation, y = ssq_full)) +
  geom_boxplot()
Code
ggplot(filteredData, aes(x = age, y = ssq_full, color = generation)) +
  geom_point()